Electrical Flows, Laplacian Systems, and Faster Approximation of Maximum Flow in Undirected Graphs
We introduce a new approach to computing an approximately maximum s-t flow in
a capacitated, undirected graph. This flow is computed by solving a sequence of
electrical flow problems. Each electrical flow is given by the solution of a
system of linear equations in a Laplacian matrix, and thus may be approximately
computed in nearly-linear time.
Using this approach, we develop the fastest known algorithm for computing
approximately maximum s-t flows. For a graph having n vertices and m edges, our
algorithm computes a (1-\epsilon)-approximately maximum s-t flow in time
\tilde{O}(mn^{1/3} \epsilon^{-11/3}). A dual version of our approach computes a
(1+\epsilon)-approximately minimum s-t cut in time
\tilde{O}(m+n^{4/3}\epsilon^{-8/3}), which is the fastest known algorithm for this
problem as well. Previously, the best dependence on m and n was achieved by the
algorithm of Goldberg and Rao (J. ACM 1998), which can be used to compute
approximately maximum s-t flows in time \tilde{O}(m\sqrt{n}\epsilon^{-1}), and
approximately minimum s-t cuts in time \tilde{O}(m+n^{3/2}\epsilon^{-3}).
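The core subroutine described above, computing an electrical flow by solving a Laplacian linear system, can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the graph, conductances, and the dense Gaussian-elimination solver are stand-ins, whereas the paper relies on nearly-linear-time approximate Laplacian solvers.

```python
# Sketch: an s-t electrical flow on a small undirected graph, obtained by
# solving the Laplacian system L v = chi_{s,t} with chi_{s,t} = e_s - e_t.
# Edge currents are then f(u,w) = c(u,w) * (v_u - v_w).

def laplacian(n, edges):
    """Weighted graph Laplacian as a dense list-of-lists matrix."""
    L = [[0.0] * n for _ in range(n)]
    for u, w, c in edges:
        L[u][u] += c
        L[w][w] += c
        L[u][w] -= c
        L[w][u] -= c
    return L

def solve(A, b):
    """Dense Gaussian elimination with partial pivoting (illustrative,
    O(n^3); the paper uses approximate Laplacian solvers instead)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= factor * M[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

# Triangle graph with unit conductances: s = 0, t = 2.  One unit of
# current enters at s and leaves at t.  The Laplacian is singular, so we
# ground vertex t by deleting its row and column before solving.
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
L = laplacian(3, edges)
reduced = [row[:2] for row in L[:2]]   # ground vertex 2
b = [1.0, 0.0]                         # chi_{s,t} restricted to vertices 0, 1
v = solve(reduced, b) + [0.0]          # vertex potentials, with v_t = 0
flows = {(u, w): c * (v[u] - v[w]) for u, w, c in edges}
```

On this triangle the direct edge (0, 2) carries 2/3 of the unit of current and the two-hop path carries 1/3, matching the effective resistance 2/3 between s and t.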
Deep reinforcement learning from human preferences
For sophisticated reinforcement learning (RL) systems to interact usefully
with real-world environments, we need to communicate complex goals to these
systems. In this work, we explore goals defined in terms of (non-expert) human
preferences between pairs of trajectory segments. We show that this approach
can effectively solve complex RL tasks without access to the reward function,
including Atari games and simulated robot locomotion, while providing feedback
on less than one percent of our agent's interactions with the environment. This
reduces the cost of human oversight far enough that it can be practically
applied to state-of-the-art RL systems. To demonstrate the flexibility of our
approach, we show that we can successfully train complex novel behaviors with
about an hour of human time. These behaviors and environments are considerably
more complex than any that have been previously learned from human feedback.
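The preference-learning idea above can be sketched in miniature: a reward model is fit to pairwise comparisons between trajectory segments by treating each human preference as a Bradley-Terry outcome. The linear reward features, synthetic "human" labels, and plain SGD below are illustrative stand-ins for the paper's neural reward model and real annotators.

```python
import math
import random

def segment_return(theta, segment):
    """Predicted reward for a segment: dot product with each observation's
    feature vector, summed over the segment."""
    return sum(sum(t * x for t, x in zip(theta, obs)) for obs in segment)

def preference_loss_grad(theta, seg1, seg2, label):
    """Gradient of the cross-entropy loss for P(seg1 preferred over seg2)
    under the Bradley-Terry model: P = sigmoid(r1 - r2)."""
    r1, r2 = segment_return(theta, seg1), segment_return(theta, seg2)
    p1 = 1.0 / (1.0 + math.exp(r2 - r1))
    err = p1 - label                      # d(loss)/d(r1) = -d(loss)/d(r2)
    grad = [0.0] * len(theta)
    for obs in seg1:
        for i, x in enumerate(obs):
            grad[i] += err * x
    for obs in seg2:
        for i, x in enumerate(obs):
            grad[i] -= err * x
    return grad

# Synthetic setup: the true (hidden) reward weights only the first
# feature; the simulated "human" prefers the segment with higher true
# return.  SGD on the preference loss should recover that direction.
rng = random.Random(0)
true_theta = [1.0, 0.0]
theta = [0.0, 0.0]
for _ in range(2000):
    seg1 = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(5)]
    seg2 = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(5)]
    label = 1.0 if segment_return(true_theta, seg1) > segment_return(true_theta, seg2) else 0.0
    grad = preference_loss_grad(theta, seg1, seg2, label)
    theta = [t - 0.05 * g for t, g in zip(theta, grad)]
```

After training, `theta` should place most of its weight on the first feature, i.e. the comparisons alone suffice to recover the reward direction without ever observing reward values.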
A Cryptographic Test of Quantumness and Certifiable Randomness from a Single Quantum Device
We give a protocol for producing certifiable randomness from a single untrusted quantum device that is polynomial-time bounded. The randomness is certified to be statistically close to uniform from the point of view of any computationally unbounded quantum adversary that may share entanglement with the quantum device. The protocol relies on the existence of post-quantum secure trapdoor claw-free functions, and introduces a new primitive for constraining the power of an untrusted quantum device. We then show how to construct this primitive based on the hardness of the learning with errors (LWE) problem. The randomness protocol can also be used as the basis for an efficiently verifiable "quantum supremacy" proposal, thus answering an outstanding challenge in the field.
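The hardness assumption underlying the construction, learning with errors (LWE), can be illustrated concretely: an LWE sample is a pair (a, b) with b = <a, s> + e (mod q) for a secret vector s and small noise e. The toy parameters below (n = 8, q = 97) are for illustration only and offer no security; the claw-free function construction in the paper is far more involved.

```python
import random

def lwe_samples(n, q, m, s, noise=1, rng=random):
    """Return m LWE samples (a, b) with b = <a, s> + e mod q, |e| <= noise."""
    samples = []
    for _ in range(m):
        a = [rng.randrange(q) for _ in range(n)]
        e = rng.randint(-noise, noise)
        b = (sum(ai * si for ai, si in zip(a, s)) + e) % q
        samples.append((a, b))
    return samples

def noise_of(a, b, s, q):
    """With the secret in hand, recovering the (centered) noise is easy."""
    e = (b - sum(ai * si for ai, si in zip(a, s))) % q
    return e if e <= q // 2 else e - q

rng = random.Random(0)
n, q = 8, 97
secret = [rng.randrange(q) for _ in range(n)]
samples = lwe_samples(n, q, 16, secret, rng=rng)
```

The asymmetry shown here is the point: checking a candidate secret against the samples is trivial, while recovering the secret from the samples alone is conjectured to be hard even for quantum algorithms, which is what makes LWE a post-quantum foundation for primitives like the trapdoor claw-free functions used in the protocol.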
- …